Delivery of streaming media

ABSTRACT

A method for evaluating an end-user's subjective assessment of streaming media quality includes obtaining reference data characterizing the media stream, and obtaining altered data characterizing the media stream after the media stream has traversed a channel that includes a network. An objective measure of the QOS of the media stream is then determined by comparing the reference data and the altered data.

FIELD OF INVENTION

[0001] This invention relates to delivery of streaming media.

BACKGROUND

[0002] Streaming media refers to content, typically audio, video, or both, that is intended to be displayed to an end-user as it is transmitted from a content provider. Because the content is being viewed in real-time, it is important that a continuous and uninterrupted stream be provided to the user. The extent to which a user perceives an uninterrupted stream that displays uncorrupted media is referred to as the “Quality of Service”, or QOS, of the system.

[0003] A content delivery service typically evaluates its QOS by collecting network statistics and inferring, on the basis of those network statistics, the user's perception of a media stream. These network statistics include such quantities as packet loss and latency that are independent of the nature of the content. The resulting evaluation of QOS is thus content-independent.

BRIEF DESCRIPTION OF THE FIGURES

[0004] FIGS. 1 and 2 show content delivery systems.

DETAILED DESCRIPTION

[0005] As shown in FIG. 1, a content delivery system 10 for the delivery of a media stream 12 from a content server 14 to a client 16 includes two distinct processes. Because a media stream requires far more bandwidth than can reasonably be accommodated on today's networks, it is first passed through an encoder 18 executing on the content server 14. The encoder 18 transforms the media stream 12 into a compressed form suitable for real-time transmission across a global computer network 22. The resulting encoded media stream 20 then traverses the global computer network 22 until it reaches the client 16. Finally, a decoder 24 executing on the client 16 transforms the encoded media stream 20 into a decoded media stream 26 suitable for display.

[0006] In the content delivery system 10 of FIG. 1, there are at least two mechanisms that can impair the media stream. First, the encoder 18 and decoder 24 can introduce errors. For example, many encoding processes discard high-frequency components of an image in an effort to compress the media stream 12. As a result, the decoded media stream 26 may not be a replica of the original media stream 12. Second, the vagaries of network transmission, many of which are merely inconvenient when text or static images are delivered, can seriously impair the real-time delivery of streaming media.

[0007] These two impairment mechanisms, hereafter referred to as encoding error and transmission error, combine to affect the end-user's subjective experience in viewing streaming media. However, the end-user's subjective experience also depends on one other factor thus far not considered: the content of the media stream 12 itself.

[0008] The extent to which a particular error affects an end-user's enjoyment of a decoded media stream 26 depends on certain features of the media stream 12. For example, a media stream 12 rich in detail will suffer considerably from loss of sharpness that results from discarding too many high-frequency components. In contrast, the same loss of sharpness in a media stream 12 rich in impressionist landscapes will scarcely be noticeable.

[0009] Referring to FIG. 2, a system 28 incorporating the invention includes a content-delivery server 30 in data communication with a client 32 across a global computer network 34. The system 28 also includes an aggregating server 36 in data communication with both the client 32 and the content-delivery server 30. The link between the aggregating server 36 and the client 32 is across the global computer network 34, whereas the link between the aggregating server 36 and the content-delivery server 30 is typically over a local area network.

[0010] An encoder 38 executing on the content-delivery server 30 applies an encoding or compression algorithm to the original media stream 39, thereby generating an encoded media stream 40. For simplicity, FIG. 2 is drawn with the output of the encoder 38 leading directly to the global computer network 34, as if encoding occurred in real-time. Although it is possible, and sometimes desirable, to encode streaming media in real-time (for example in the case of video-conferencing applications), in most cases encoding is carried out in advance. In such cases, the encoded media 40 is stored on a mass-storage system (not shown) associated with the content-delivery server 30.

[0011] A variety of encoding processes are available. In many cases, these encoding processes are lossy. For example, certain encoding processes will discard high-frequency components of an image under the assumption that, when the image is later decoded, the absence of those high-frequency components will not be apparent to the user. Whether this is indeed the case will depend in part on the features of the image.

[0012] In addition to being transmitted to the client 32 over the global computer network 34, the encoded media 40 at the output of the encoder 38 is also provided to the input of a first decoder 42, shown in FIG. 2 as being associated with the aggregating server 36. The first decoder 42 recovers the original media stream to the extent that the possibly lossy encoding performed by the encoder 38 makes it possible to do so.

[0013] The output of the decoding process is then provided to a first feature extractor 44, also executing on the aggregating server 36. The first feature extractor 44 implements known feature extraction algorithms for extracting temporal or spatial features of the encoded media 40. Known feature extraction methods include the Sarnoff JND (“Just Noticeable Difference”) method and the methods disclosed in the ANSI T1.801.03-1996 (“American National Standard for Telecommunications - Digital Transport of One Way Video Signals Parameters for Objective Performance Specification”) specification.

[0014] A typical feature-extractor might evaluate a discrete cosine transform (“DCT”) of an image or a portion of an image. The distribution of high and low frequencies in the DCT would provide an indication of how much detail is in any particular image. Changes in the distribution of high and low frequencies in DCTs of different images would provide an indication of how rapidly images are changing with time, and hence how much “action” is actually in the moving image.
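
The DCT-based feature described above can be illustrated with a minimal sketch. It is not the Sarnoff JND or ANSI T1.801.03 method. The function names `dct_2d` and `high_frequency_fraction`, the square block shape, and the cutoff rule (coefficient indices with u + v at or above half the block size count as "high frequency") are assumptions made only for this illustration.

```python
import math


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # Cosine basis values: basis[u][x] = cos((2x + 1) * u * pi / (2n)).
    basis = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
             for u in range(n)]
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += block[x][y] * basis[u][x] * basis[v][y]
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def high_frequency_fraction(block, cutoff=None):
    """Share of non-DC energy at indices u + v >= cutoff (hypothetical detail feature)."""
    n = len(block)
    cutoff = n // 2 if cutoff is None else cutoff
    coeffs = dct_2d(block)
    dc_energy = coeffs[0][0] ** 2
    total = 0.0
    high = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # the DC term carries brightness, not detail
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= cutoff:
                high += energy
    # Treat rounding noise in a flat block as "no detail at all".
    if total <= 1e-12 * (dc_energy + 1.0):
        return 0.0
    return high / total


# A checkerboard is rich in fine detail; a smooth ramp is not.
checker = [[255 if (x + y) % 2 else 0 for y in range(8)] for x in range(8)]
ramp = [[10 * y for y in range(8)] for x in range(8)]
detail_checker = high_frequency_fraction(checker)
detail_ramp = high_frequency_fraction(ramp)
```

Under these assumptions, the frame-to-frame change in such a value could serve as a rough indication of how much "action" is in the moving image.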

[0015] The original media 39 is also passed through a second feature extractor 46 identical to the first feature extractor 44. The outputs of the first and second feature extractors 44, 46 are then compared by a first analyzer 48. This comparison results in the calculation of an encoding metric indicative of the extent to which the subjective perception of a user would be degraded by the encoding and decoding algorithms by themselves.

[0016] An analyzer compares DCTs of two images, both of which are typically matrix quantities, and maps the difference to a scalar. The output of the analyzer is typically a dimensionless quantity between 0 and 1 that represents a normalized measure of how different the frequency distributions of two images are.
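
One possible way to map two DCT matrices to a dimensionless value between 0 and 1 is sketched below. The text does not specify the mapping; the normalization chosen here (Frobenius norm of the difference divided by the sum of the two norms) and the name `analyzer_score` are assumptions for illustration only.

```python
import math


def frobenius(matrix):
    """Square root of the sum of squared entries of a matrix (list of rows)."""
    return math.sqrt(sum(value * value for row in matrix for value in row))


def analyzer_score(reference, altered):
    """Normalized difference in [0, 1] between two equally sized coefficient matrices."""
    if len(reference) != len(altered) or any(
            len(r) != len(a) for r, a in zip(reference, altered)):
        raise ValueError("matrices must have the same shape")
    diff = [[r - a for r, a in zip(ref_row, alt_row)]
            for ref_row, alt_row in zip(reference, altered)]
    denom = frobenius(reference) + frobenius(altered)
    if denom == 0.0:
        return 0.0  # both matrices are all zeros, so they are identical
    # The triangle inequality bounds the ratio by 1; min() guards against rounding.
    return min(1.0, frobenius(diff) / denom)


reference = [[1.0, 2.0], [3.0, 4.0]]
altered = [[1.0, 2.0], [3.0, 0.0]]  # one coefficient discarded
score = analyzer_score(reference, altered)
```

With this choice, identical matrices score 0 and a matrix compared with its own negation scores 1.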

[0017] The content-delivery server 30 transmits the encoded media 40 to the user by placing it on the global computer network 34. Once on the global computer network 34, the encoded media 40 is subjected to the various difficulties that are commonly encountered when transmitting data of any type on such a network 34. These include jitter, packet loss, and packet latency. In one embodiment, statistics on these and other measures of transmission error are collected by a network performance monitor 52 and made available to the aggregating server 36.
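
The kind of statistics a network performance monitor might report can be sketched as follows. The record layout (`PacketRecord`), the function name `summarize`, and the simplified jitter definition (mean absolute change in latency between consecutive delivered packets) are assumptions for illustration and are not taken from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PacketRecord:
    """Hypothetical per-packet observation."""
    seq: int
    sent_at: float                 # seconds
    received_at: Optional[float]   # seconds, or None if the packet was lost


def summarize(records: List[PacketRecord]) -> Dict[str, float]:
    """Return packet loss ratio, mean latency, and a simple jitter estimate."""
    if not records:
        return {"packet_loss": 0.0, "latency": 0.0, "jitter": 0.0}
    ordered = sorted(records, key=lambda r: r.seq)
    latencies = [r.received_at - r.sent_at
                 for r in ordered if r.received_at is not None]
    packet_loss = 1.0 - len(latencies) / len(ordered)
    latency = sum(latencies) / len(latencies) if latencies else 0.0
    # Jitter here: mean absolute latency change between consecutive deliveries.
    deltas = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    jitter = sum(deltas) / len(deltas) if deltas else 0.0
    return {"packet_loss": packet_loss, "latency": latency, "jitter": jitter}


stats = summarize([
    PacketRecord(0, 0.00, 0.05),
    PacketRecord(1, 0.02, 0.09),
    PacketRecord(2, 0.04, None),   # lost in transit
    PacketRecord(3, 0.06, 0.10),
])
```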

[0018] The media stream received by the client 32 is then provided to a second decoder 54 identical to the first decoder 42. A decoded stream 56 from the output of the second decoder 54 is made available for display to the end-user. In addition, the decoded stream 56 is passed through a third feature extractor 58 identical to the first and second feature extractors 44, 46. The output of the third feature extractor 58 is provided to a second analyzer 60.

[0019] The inputs to both the first and third feature extractors 44, 58 have been processed by the same encoder 38 and by identical decoders 42, 54. However, unlike the input to the third feature extractor 58, the input to the first feature extractor 44 was never transported across the network 34. Hence, any difference between the outputs of the first and third feature extractors 44, 58 can be attributed to transmission errors alone. This difference is determined by the second analyzer 60, which compares the outputs of the first and third feature extractors 44, 58. On the basis of this difference, the second analyzer 60 calculates a transmission metric indicative of the extent to which the subjective perception of a user would be degraded by the transmission error alone.

[0020] The system 28 thus provides an estimate of a user's perception of the quality of a media stream on the basis of features in the rendered stream. This estimate is separable into a first portion that depends only on encoding error and a second portion that depends only on transmission error.
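
The separation into an encoding portion and a transmission portion can be sketched with three copies of the same content: the original, the stream decoded without crossing the network, and the stream decoded at the client. The toy feature extractor, the normalized distance, and all names below are assumptions for illustration, not the extractors or analyzers of the described system.

```python
def toy_features(samples):
    """Hypothetical extractor: mean level and mean absolute neighbor difference."""
    mean = sum(samples) / len(samples)
    steps = [abs(b - a) for a, b in zip(samples, samples[1:])]
    sharpness = sum(steps) / len(steps) if steps else 0.0
    return [mean, sharpness]


def feature_distance(a, b):
    """Normalized L1 distance in [0, 1] between two feature vectors."""
    denom = sum(abs(x) for x in a) + sum(abs(y) for y in b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


def quality_report(original, locally_decoded, client_decoded, extract=toy_features):
    """Compare features pairwise so each metric isolates one impairment."""
    f_original = extract(original)           # second feature extractor
    f_reference = extract(locally_decoded)   # first feature extractor
    f_altered = extract(client_decoded)      # third feature extractor
    return {
        # original vs. encoded-and-decoded copy: encoding error only
        "encoding_metric": feature_distance(f_original, f_reference),
        # two decoded copies, one sent over the network: transmission error only
        "transmission_metric": feature_distance(f_reference, f_altered),
    }


original = [0, 10, 0, 10, 0, 10]
locally_decoded = [2, 8, 2, 8, 2, 8]   # softened by lossy encoding
client_decoded = [2, 8, 2, 2, 2, 8]    # one sample damaged in transit
report = quality_report(original, locally_decoded, client_decoded)
```

Because the reference and altered data share the same encoder and identical decoders, a nonzero transmission metric in this sketch reflects only what happened on the network.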

[0021] Having determined a transmission metric, it is useful to identify the relative effects of different types of transmission errors on the transmission metric. To do so, the network statistics obtained by the network performance monitor 52 and the transmission metric determined by the second analyzer 60 are provided to a correlator 62. The correlator 62 can then correlate the network statistics with values of the transmission metric. The result of this correlation identifies those types of network errors that most significantly affect the end-user's experience.

[0022] In one embodiment, the correlator 62 averages network statistics over a fixed time-interval and compares averages thus generated with corresponding averages of transmission metrics for that time-interval. This enables the correlator 62 to establish, for that time interval, contributions of specific network impairments, such as jitter, packet loss, and packet latency, toward the end-user's experience.
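
The averaging and comparison performed by the correlator can be sketched as follows. The use of a Pearson correlation coefficient, the timestamped `(time, value)` sample layout, and the function names are assumptions for illustration; the description above does not specify how the averages are compared.

```python
import math
from collections import defaultdict


def window_means(samples, window):
    """Average (timestamp, value) samples over fixed, consecutive time windows."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[int(timestamp // window)].append(value)
    return {index: sum(vals) / len(vals) for index, vals in buckets.items()}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlate(stat_samples, metric_samples, window):
    """Correlate per-window averages of one network statistic with the metric."""
    stat_avg = window_means(stat_samples, window)
    metric_avg = window_means(metric_samples, window)
    shared = sorted(set(stat_avg) & set(metric_avg))
    return pearson([stat_avg[i] for i in shared], [metric_avg[i] for i in shared])


# Hypothetical observations over four 10-second windows.
packet_loss = [(1, 0.00), (6, 0.02), (11, 0.05), (17, 0.07),
               (22, 0.01), (28, 0.01), (33, 0.10), (36, 0.12)]
latency = [(5, 0.05), (15, 0.05), (25, 0.05), (35, 0.05)]
transmission_metric = [(5, 0.10), (15, 0.35), (25, 0.10), (35, 0.60)]

loss_effect = correlate(packet_loss, transmission_metric, window=10)
latency_effect = correlate(latency, transmission_metric, window=10)
```

In this made-up data the metric tracks packet loss closely while latency is constant, so packet loss would be flagged as the impairment that matters most for these windows.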

[0023] Although the various processes are shown in FIG. 2 as executing on specific servers, this is not a requirement. For example, the system 28 can also be configured so that the first decoder 42 executes on the content-delivery server 30 rather than on the aggregating server 36 as shown in FIG. 2. In one embodiment, the output of the first feature extractor 44 is sent to the client 32 and the second analyzer 60 executes at the client 32 rather than at the aggregating server 36. The server selected to execute a particular process depends, to a great extent, on load balancing.

[0024] Other embodiments are within the scope of the following claims.

We claim:
 1. A method comprising: obtaining reference data that characterizes a media stream, obtaining altered data that characterizes said media stream after said media stream has traversed a channel that includes a network; and determining a quality of service of said channel on the basis of a comparison of said reference data and said altered data.
 2. The method of claim 1, wherein said reference data characterizes a feature of said media stream; and said altered data characterizes a feature of said media stream after said media stream has traversed said channel.
 3. The method of claim 1, wherein obtaining at least one of said reference and said altered data comprises applying a Sarnoff JND algorithm or an ANSI T1.801.03 algorithm.
 4. The method of claim 2, wherein determining a quality of service of said channel comprises comparing said first reference data and said altered data.
 5. The method of claim 1, further comprising: obtaining network statistics associated with transmission on said channel; and correlating said network statistics with said altered data.
 6. The method of claim 5, further comprising selecting said network statistics from the group consisting of jitter, packet loss, and packet latency.
 7. The method of claim 1, further comprising selecting said channel to include: an encoder for creating an encoded representation of said media stream; a decoder for recovering said media stream from said encoded representation; and a computer network between said encoder and said decoder.
 8. The method of claim 1, wherein obtaining said reference data comprises: passing said media stream through an encoder to generate an encoded signal; passing said encoded signal through a decoder to generate a decoded media stream; and passing said decoded media stream through a feature extractor to extract said reference data.
 9. A system comprising: a first feature extractor for generating reference data characterizing a media stream; a second feature extractor for generating altered data characterizing said media stream after said media stream has traversed a channel that includes a network; and an analyzer for comparing said reference data and said altered data to generate a transmission metric indicative of a quality of service.
 10. The system of claim 9, further comprising a correlator in communication with said analyzer, said correlator being configured to correlate network statistics associated with said channel with said transmission metric.
 11. The system of claim 10, further comprising a network monitor in communication with said correlator, said network monitor being configured to collect said network statistics.
 12. The system of claim 10, wherein said correlator is configured to correlate statistics selected from the group consisting of: jitter, packet loss, and packet latency.
 13. The system of claim 9, wherein said first and second feature extractors are configured to extract media features using an algorithm selected from the group consisting of: the Sarnoff JND algorithm and the ANSI T1.801.03 algorithm.
 14. A computer-readable medium having software encoded thereon, said software comprising instructions for: obtaining reference data that characterizes a media stream, obtaining altered data that characterizes said media stream after said media stream has traversed a channel that includes a network; and determining a quality of service of said channel on the basis of a comparison of said reference data and said altered data.
 15. The computer-readable medium of claim 14, wherein said instructions for obtaining reference data include instructions for generating reference data characterizing a feature of said media stream; and said instructions for obtaining altered data comprise instructions for generating altered data that characterizes a feature of said media stream after said media stream has traversed said channel.
 16. The computer-readable medium of claim 14, wherein said instructions for obtaining at least one of said reference and said altered data comprise instructions for applying a Sarnoff JND algorithm or an ANSI T1.801.03 algorithm.
 17. The computer-readable medium of claim 15, wherein said instructions for determining a quality of service of said channel comprise instructions for comparing said first reference data and said altered data.
 18. The computer-readable medium of claim 14, wherein said software further comprises instructions for: obtaining network statistics associated with transmission on said channel; and correlating said network statistics with said altered data.
 19. The computer-readable medium of claim 18, wherein said software further comprises instructions for selecting said network statistics from the group consisting of jitter, packet loss, and packet latency.
 20. The computer-readable medium of claim 14, wherein said software further comprises instructions for selecting said channel to include: an encoder for creating an encoded representation of said media stream; a decoder for recovering said media stream from said encoded representation; and a computer network between said encoder and said decoder.
 21. The computer-readable medium of claim 14, wherein said instructions for obtaining said reference data comprise instructions for: passing said media stream through an encoder to generate an encoded signal; passing said encoded signal through a decoder to generate a decoded media stream; and passing said decoded media stream through a feature extractor to extract said reference data.